The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
Authors
Abstract
We propose a new high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients, subject to a budget on the maximal absolute correlation between the features and residuals. We show that the estimator can be expressed as a solution to a Mixed Integer Linear Optimization (MILO) problem, a computationally tractable framework that delivers provably optimal global solutions. The current state of algorithmics in integer optimization makes our proposal substantially more scalable than the least squares subset selection framework based on integer quadratic optimization, recently proposed in [7], and the continuous nonconvex quadratic optimization framework of [34]. We propose new discrete first-order methods, which, when paired with state-of-the-art MILO solvers, lead to superior upper bounds for the Discrete Dantzig Selector problem for a given computational budget. We demonstrate that the integrated approach proposed herein also provides globally optimal solutions in significantly shorter computation times than off-the-shelf MILO solvers. We demonstrate, both theoretically and empirically, that in a wide range of regimes the statistical properties of the Discrete Dantzig Selector are superior to those of popular ℓ1-based approaches. Our approach scales gracefully to problem instances with up to p = 10,000 features with provable optimality, making it, to the best of our knowledge, the most scalable exact variable selection approach in sparse linear modeling at the moment.
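The estimator described in the abstract can be illustrated with a small sketch. The abstract states only that the problem can be written as a MILO; the big-M formulation below, the bound `big_m`, the function name `discrete_dantzig`, and the use of SciPy's `milp` solver are all assumptions made for illustration, not the authors' formulation or algorithm. The sketch minimizes the number of nonzero coefficients subject to the maximal absolute correlation between features and residuals being at most `delta`.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds


def discrete_dantzig(X, y, delta, big_m=10.0):
    """Illustrative big-M MILO for: min ||beta||_0 s.t. ||X^T (y - X beta)||_inf <= delta.

    big_m is an assumed bound on |beta_i|; it is not taken from the paper.
    """
    n, p = X.shape
    gram = X.T @ X
    xty = X.T @ y
    eye = np.eye(p)
    zeros = np.zeros((p, p))

    # Decision vector is [beta (continuous, length p), z (binary, length p)].
    cost = np.concatenate([np.zeros(p), np.ones(p)])  # count selected features

    # Correlation budget: xty - delta <= gram @ beta <= xty + delta.
    corr = LinearConstraint(np.hstack([gram, zeros]), xty - delta, xty + delta)
    # Linking constraints: -big_m * z_i <= beta_i <= big_m * z_i.
    upper = LinearConstraint(np.hstack([eye, -big_m * eye]), -np.inf, 0.0)
    lower = LinearConstraint(np.hstack([eye, big_m * eye]), 0.0, np.inf)

    bounds = Bounds(
        np.concatenate([-big_m * np.ones(p), np.zeros(p)]),
        np.concatenate([big_m * np.ones(p), np.ones(p)]),
    )
    integrality = np.concatenate([np.zeros(p), np.ones(p)])  # only z is integer

    res = milp(cost, constraints=[corr, upper, lower],
               bounds=bounds, integrality=integrality)
    if not res.success:
        raise ValueError(res.message)
    beta = res.x[:p]
    support = np.flatnonzero(res.x[p:] > 0.5)
    return beta, support


# Toy orthogonal design: only the two large responses need a nonzero coefficient.
X_demo = np.eye(4)
y_demo = np.array([3.0, 0.1, -2.0, 0.05])
beta_demo, support_demo = discrete_dantzig(X_demo, y_demo, delta=0.5)
```

With an identity design the constraint reduces to |y_i - beta_i| <= delta for each coordinate, so a coefficient can be set to zero exactly when |y_i| <= delta. A generic solver applied this way is only practical for small p; the scalability reported in the abstract relies on the authors' discrete first-order methods, which are not reproduced here.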
Similar resources
Dantzig selector homotopy with dynamic measurements
The Dantzig selector is a near-ideal estimator for recovery of sparse signals from linear measurements in the presence of noise. It is a convex optimization problem that can be recast as a linear program (LP) for real data and solved using an LP solver. In this paper we present an alternative approach to solving the Dantzig selector, which we call “Primal Dual pursuit” or “PD pursuit”. It is...
Dantzig Selector with an Approximately Optimal Denoising Matrix and its Application in Sparse Reinforcement Learning
Dantzig Selector (DS) is widely used in compressed sensing and sparse learning for feature selection and sparse signal recovery. Since the DS formulation is essentially a linear programming optimization, many existing linear programming solvers can be simply applied for scaling up. The DS formulation can be explained as a basis pursuit denoising problem, wherein the data matrix (or measurement ...
On High Dimensional Post-Regularization Prediction Intervals
This paper considers the construction of prediction intervals for future observations in high dimensional regression models. We propose a new approach to evaluate the uncertainty in estimating the mean parameter based on the widely used penalization/regularization methods. The proposed method is then applied to construct prediction intervals for sparse linear models as well as sparse additive ...
High-dimensional stochastic optimization with the generalized Dantzig estimator
We propose a generalized version of the Dantzig selector. We show that it satisfies sparsity oracle inequalities in prediction and estimation. We then consider the particular case of high-dimensional linear regression model selection with the Huber loss function. In this case we derive the sup-norm convergence rate and the sign concentration property of the Dantzig estimators under a mutual coh...
On robust width property for Lasso and Dantzig selector
Recently, Cahill and Mixon completely characterized the sensing operators in many compressed sensing instances with a robust width property. The proposed property allows uniformly stable and robust reconstruction of certain solutions from an underdetermined linear system via convex optimization. However, their theory does not cover the Lasso and Dantzig selector models, both of which are popula...
Journal title:
- IEEE Trans. Information Theory
Volume 63
Publication date: 2017